Abstract


The  shapes  of  naturally  occurring  objects  characteristically  involve  spatial  events 
occurring  at  many  scales.  It  is  important  to  make  explicit  the  multiscale  structure  of 
a  shape  object  in  order  to  effectively  perform  shape  recognition  or  to  engage  in  other 
forms  of  reasoning  about  shape.  Currently  available  techniques  for  multiscale  shape 
analysis  include  image  blurring  and  contour  smoothing;  each  of  these  techniques 
involves  uniform  application  of  a  smoothing  operator  to  the  entire  image  array  or 
contour.  This  paper  offers  a  new,  symbolic,  approach  to  constructing  a  primitive  shape 
description  across  scales  for  2d  binary  (silhouette)  shape  images.  Under  this  approach, 
grouping  operations  are  performed  over  collections  of  tokens  residing  on  a  Scale-Space 
Blackboard.  Two  types  of  grouping  operations  are  identified  that,  respectively:  (1) 
aggregate  edge  primitives  at  one  scale  into  edge  primitives  at  a  coarser  scale,  and 
(2)  group  edge  primitives  into  partial-region  assertions,  including  curved-contours, 
primitive-corners,  and  bars.  Algorithms  to  perform  these  computations  are  presented. 
1  Introduction

The  shapes  of  naturally  occurring  objects  characteristically  involve  spatial 
events  occurring  at  a  multitude  of  scales.  For  example,  the  fish  shape  in  figure 
1  appears  at  a  coarse  scale  simply  as  an  elongated  blob;  at  a  medium  scale 
as a somewhat more well-defined blob with smaller blobs (fins) attached,
and finally, at a fine scale, as a sharply defined anchovy complete with
pronounced  fin  contours,  pointed  tail  flukes,  and  a  mouth.  Shape  details 
appearing  at  finer  scales  are  situated  in  relation  to  one  another  by  the  spatial 
structure  emergent  at  coarser  scales.  It  is  important  to  make  explicit  the 
Figure 1. Important shape features occur at many scales.


multiscale structure of a shape object¹ in order to effectively perform shape
recognition or to engage in other forms of reasoning about shape because
important distinguishing characteristics or features may occur at any scale.

For  this  reason  one  widely  cited  goal  for  early  visual  shape  processing  is 
to construct a description of a shape at a variety of scales [Witkin, 1983;
Mokhtarian and Mackworth, 1986; Asada and Brady, 1986; Pizer et al., 1986;
Koenderink, 1984; Burt and Adelson, 1983; Crowley and Parker, 1984; Crowley
and Sanderson, 1984; Sammet and Rosenfeld, 1980]. From these descriptions
may be extracted important primitive shape events to be used by later
stages devoted to object recognition or other visual tasks. This paper is
concerned with building multiscale shape descriptions of two-dimensional binary
(silhouette) shape images in terms of edge and region (blob) shape primitives.

Currently  available  techniques  for  multiscale  shape  analysis  are  of  two 
basic  types:  contour-based  smoothing  and  region-based  smoothing.  Both  of 
these approaches are based on the application of a numerical smoothing
operator uniformly to some one-dimensional (contour-based) or two-dimensional
(region-based) array of shape data. The operator is typically characterized
by a size or width parameter indicating the degree of smoothing performed
and hence the scale of the result. Region-based smoothing techniques may
be further subdivided into isotropic smoothing operators and oriented
filters. As will be shown, at coarse scales both contour-based smoothing and
isotropic  region  smoothing  approaches  fail  to  capture  in  a  consistent  manner 
important  structure  inherent  to  shape  objects.  The  prospects  for  oriented 
filters  are  uncertain. 

This  paper  describes  a  fundamentally  different  approach  to  extracting 
primitive  shape  descriptions  at  multiple  scales.  The  approach  is  based  on 
grouping  of  shape  tokens  in  the  style  of  the  Primal  Sketch  [Marr,  1976].  Each 
token  may  bear  more  information  than  just  the  local  magnitude  of  an  image 
intensity  or  local  orientation  of  a  contour.  The  approach  may  be  considered 
symbolic  because  the  tokens  are,  conceptually,  discrete  entities,  and  because 
the  grouping  steps  actually  taken  depend  necessarily  on  the  shape  data  itself. 
This  is  in  contrast  to  uniform  numeric  smoothing  algorithms  which  carry  out 
the  same  arithmetic  procedure  everywhere  regardless  of  the  shape  content  of 
the  data. 

An important tool we introduce for carrying out the grouping operations
is the Scale-Space Blackboard. Tokens are placed on the Blackboard according
to their location, orientation, and scale. The Scale-Space Blackboard
facilitates manipulation of shape information because it permits tokens to be
indexed on the basis of location and scale.

¹ We refer to a figure whose shape we are analyzing as a shape object.

The  grouping  procedures  specify  situations  under  which  a  collection  of 
tokens  should  give  rise  to  a  new  token.  Two  types  of  grouping  operation 
are presented: (1) Fine-to-coarse aggregation of edge primitives generates a
coarser scale edge map from finer scale edge primitives. (2) Pairwise grouping
of symmetrically placed edge primitive tokens supports assertions of curved-
contour, primitive-corner, and bar events, all of which demark partial-regions.
These  events  are  marked  by  partial-region  type  tokens  placed  on  the  Scale- 
Space  Blackboard. 

The  outline  of  the  paper  is  as  follows:  The  remainder  of  the  Introduction 
explores characteristics desired of a multiscale shape representation. Sections
2.1 and 2.2 briefly illustrate disadvantages of contour-based smoothing
and isotropic region-based smoothing approaches to identifying important
coarse scale structure in shape images, while Section 2.3 shows that oriented
edge filters offer some improvement over isotropic region-based smoothing
operators. Section 3 introduces the Scale-Space Blackboard as a data structure
which allows shapes to be manipulated symbolically, while preserving a
pictorial quality to the organization of spatial information. Section 4 offers
an  algorithm  for  fine-to-coarse  aggregation  of  edge  primitives  through  token 
grouping.  Section  5  presents  rules  for  grouping  edge  primitives  in  order  to 
identify  more  complex  structures  constituting  partial-regions. 

1.1  Objectives for Multiple Scale Shape Representation

The motivation for describing shapes at multiple scales is to separate geometric
features and properties of differing size or scale, on the assumption that
they are likely to reflect different parts, processes, or functional properties
of objects encountered in the visual world. For example, the body and stem
of an apple are related to one another by, among other things, a difference
in relative size. If the early stages of visual processing can deliver object
descriptions making explicit relative sizes, then later stages of processing, such
as  visual  recognition,  may  be  assisted  in  carrying  out  tasks  such  as  matching 


these  descriptions  to  internal  models  of  known  objects:  An  apple  consists  of 
a  large  blob  (body)  with  a  small  elongated  part  (stem)  attached. 

In  evaluating  the  performance  of  a  multiple  scale  shape  description,  it  is 
important  to  have  established,  at  the  outset,  expectations  for  just  what  sorts 
of  geometric  structure  the  computation  is  intended  to  segregate  according  to 
size  or  scale.  We  proceed  from  the  following  notion:  size  or  scale  corresponds 
to  spatial  extent  in  the  image  of  a  shape  object.  Thus,  the  body  of  an  apple  is 
considered  a  larger  scale  feature  than  the  stem  because  it  has  greater  spatial 
extent. 

To be more precise, however, the term “spatial extent” may be interpreted
in either of two ways: as linear distance, or as area. It is clear that the
body  of  an  apple  is  a  large  scale  feature  relative  to  the  stem,  both  because 
its  diameter  is  larger  than  the  length  of  the  stem,  and  because  it  has  greater 
area  than  the  stem.  But  suppose  the  apple  is  hanging  from  a  string.  (See 
figure  2).  The  string  may  have  a  length  comparable  to  the  diameter  of  the 
apple,  but,  because  of  its  narrow  width,  cover  an  area  more  similar  to  that 


Figure  2.  A  two-dimensional  apple  shape  (a)  retains  its  fine  and  coarse  scale 
structure  even  when  the  apple  hangs  from  a  string  (b)  and  when  the  apple 
is  placed  near  another  large  object  (c).  d.  The  large  scale  figure/ground 
boundary  formed  by  the  top  of  the  apple  remains  unchanged  under  these 
circumstances. 


of  the  stem.  So  should  the  string  be  considered  a  large  or  small  scale  spatial 
event? 

This example suggests that a multiscale shape representation treat object
boundaries differently from the regions they enclose. Thus, the scale assigned
to  a  contour  boundary,  such  as  the  edge  of  a  piece  of  string,  should  depend 
on  its  linear  extent,  while  the  scale  assigned  to  a  local  blob  or  region,  such 
as  the  body  of  the  apple  or  a  snippet  of  string,  should  depend  upon  its  area. 

If  the  purpose  of  a  multiscale  shape  description  is  to  segregate  features 
according to scale, then shape events at different scales should not interfere
with one another. For example, the rounded top of an apple forms a
large  scale  boundary  between  the  body  of  the  apple  and  the  background,  as 
shown  in  figure  2d.  The  presence  of  the  small  scale  apple  stem,  or  even  the 
string,  does  not  change  this  gross  feature,  and  the  coarse  scale  description 
of  this  boundary  should  not  be  affected  by  the  presence  or  absence  of  the 
stem or string. Conversely, the description of smaller scale shape features or
properties should remain unchanged no matter what their proximity to large
features. For example, were the apple placed next to another, much larger
object, the body of the apple would become, in comparison, a small scale
object  (figure  2c).  Nonetheless,  the  description  of  the  apple  body  should 
remain  unaffected;  the  apple  is  still  a  roughly  circular  blob  with  dimples  on 
the  top  and  bottom. 


2  Uniform  Numerical  Smoothing  Methods 


A two-dimensional region, and the one-dimensional contour enclosing this
region, are complementary ways of describing a two-dimensional shape object.
Accordingly, two alternative schemes are available for representing a
shape object at the pixel level: as a two-dimensional array indexed by x, y
spatial coordinates, or as a one-dimensional array indexed by distance along
the contour, s. With each type of representation are associated natural
approaches to obtaining descriptions at different scales by applying some form
of numerical smoothing technique uniformly to the data.
2.1  Contour-Based Smoothing

Contour-based shape representations organize the description of a shape in
terms of a succession of points along an object’s boundary. Several variations
of contour-based shape representation have been used. These include encoding
of: (1) successive pixel (x, y) location, e.g. [Mokhtarian and Mackworth,
1986], (2) differences in successive pixel locations (Δx, Δy), e.g. [Freeman,
1974], and (3) local orientation (arctan Δy/Δx), e.g. [Asada and Brady, 1986].
Contour smoothing operations modify the path of the two-dimensional contour
curve in space, and sometimes also its length. Here we illustrate contour-based
smoothing under the technique of encoding pixel (x, y) location as a
function of arc length, s (measured in terms of pixel count), and smoothing
the x(s) and y(s) functions independently:

$$x'(s) = \sum_{i=-a\sigma}^{a\sigma} G_\sigma(i)\, x(s-i) \qquad (1)$$

$$y'(s) = \sum_{i=-a\sigma}^{a\sigma} G_\sigma(i)\, y(s-i), \qquad (2)$$

where G_σ is a Gaussian of width σ and the factor, a, effectively truncates
the tail of the Gaussian (a = 3 is a suitable number). Under this scheme a
closed contour is guaranteed to remain closed after smoothing, while this is
not true for representations of orientation versus arc length. Figure 3 shows
the contour of an apple shape under different degrees of contour smoothing
obtained by using Gaussians of various widths.
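
The smoothing of equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the kernel is normalised to sum to one (the text does not say so), the arc-length index is treated as periodic so that a closed contour stays closed, and the function names and the rippled-circle test contour are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma, a=3):
    """Samples of G_sigma(i) for i = -a*sigma .. a*sigma, normalised to sum to 1 (assumption)."""
    half = int(np.ceil(a * sigma))
    i = np.arange(-half, half + 1)
    g = np.exp(-0.5 * (i / sigma) ** 2)
    return g / g.sum()

def smooth_closed_contour(x, y, sigma, a=3):
    """Apply equations (1) and (2) with s taken modulo the contour length."""
    g = gaussian_kernel(sigma, a)
    half = len(g) // 2

    def smooth(f):
        f = np.asarray(f, dtype=float)
        out = np.zeros_like(f)
        for i, w in zip(range(-half, half + 1), g):
            out += w * np.roll(f, i)  # np.roll(f, i)[s] == f[(s - i) mod n]
        return out

    return smooth(x), smooth(y)

# Hypothetical test contour: a unit circle carrying a fine-scale ripple.
s = np.arange(200)
theta = 2 * np.pi * s / 200
radius = 1 + 0.05 * np.cos(20 * theta)
x, y = radius * np.cos(theta), radius * np.sin(theta)
xs, ys = smooth_closed_contour(x, y, sigma=5)
```

With these values the ripple is almost entirely removed while the circle itself shrinks only slightly, which is the favourable behaviour the apple example displays.
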
Figure 3. Apple shape encoded in terms of pixels along its bounding contour,
x(s) and y(s). Smoothing these one-dimensional arrays yields a smoothed
shape  contour. 


For some shape objects, contour-based smoothing does a good job of
removing  fine  scale  detail  while  preserving  the  larger  scale  aspects  of  the 
shape.  Indeed,  the  apple  is  one  example  of  such  a  case.  However,  many  other 
shapes  exist  for  which  contour  smoothing  fails  to  identify  important  coarse 
scale  structure,  or  else  inappropriately  suggests  the  presence  of  nonexistent 
coarse scale structure. Figure 4 illustrates. To the human eye, in figure 4a
two  parallel  bars  are  prominent;  under  contour  smoothing  one  of  the  bars 
remains  at  a  coarse  scale,  while  the  other  breaks  up.  In  figure  4b,  the  apple  is 
shown  hanging  from  a  string.  Contour  smoothing  to  a  coarse  scale  results  in 
misleading  distortion  and  absurd  implications  about  the  gross  shape.  These 
effects  can  create  hardships  for  any  later  processing  stages  which  may  seek  to 
perform  part  segmentation,  match  to  object  models,  or  otherwise  interpret 
coarser  scale  shape  descriptions.  A  related  problem  arising  with  contour- 
based  smoothing  occurs  in  figure  4c.  Here,  a  banana  is  placed  near  the  apple. 
A  very  small  change  in  shape,  resulting  from  the  banana  being  moved  a  little 
closer  to  the  apple,  leads  to  a  very  large  change  in  the  coarsely  smoothed 
contour. 

As these examples show, contour-based representations place undue emphasis
on the topology of shape boundaries. The resulting descriptive instabilities
are likely to introduce insurmountable complications later on. We
conclude that purely contour-based smoothing approaches do not provide an
appropriate basis for constructing multiscale shape descriptions.

2.2  Isotropic  Region-Based  Smoothing 

Region-based smoothing techniques start with representations for shape consisting
of two-dimensional arrays of numbers. A two-dimensional shape object
(silhouette) assigns the value (say) 1 to locations in a two-dimensional
array covered by the object (figure), and 0 to the surrounding space (ground).
In general, filtering a two-dimensional array of binary-valued pixels results
in an array containing real numbers. Each such grey-level value may be
interpreted  as  the  “strength”  of  the  filtering  kernel  response  at  that  location. 

Most  popular  among  region-based  smoothing  operators  is  convolution 
with  the  circularly  symmetric  Gaussian.  This  operator  is  spatially  isotropic, 
and is often followed by a differential operator such as the Gradient Magnitude
or Laplacian. The latter is usually incorporated into the Gaussian
smoothing step, yielding the well known ∇²G, and its approximation, the


Figure  4.  a.  Contour  smoothing  fails  to  capture  the  large  scale  interpretation 
that  two  parallel  bars  are  present,  b.  Under  contour  smoothing,  a  string  tied 
to  the  apple  grossly  distorts  the  apple's  shape  at  coarse  scales,  c.  Moving  a 
banana  so  that  it  just  touches  the  apple  leads  to  a  large  and  discontinuous 
change in the coarse scale description. Contour-based smoothing methods
place undue emphasis on the topology of bounding contours.
DOG (Difference of Gaussians). The outputs of these filtering operators typically
feed some sort of thresholding step resulting in edge [Marr and Hildreth,
1980; Canny, 1986] or region/blob [Crowley and Sanderson, 1984; Crowley
and Parker, 1984; Voorhees, 1987] assertions.

Figure  5  shows  the  result  after  Gaussian  smoothing  the  binary  silhouette 
of  an  apple  with  filters  of  various  widths.  Also  shown  are  edges  found  by 
thresholding and then thinning the gradient magnitude². Gaussian smoothing
yields a field of numbers that may be interpreted as the “density of matter”
at each spatial location, averaged in all directions. The edges found by
taking peaks in the gradient magnitude of this map do a good job of removing
small scale details about the apple's bounding contour, while preserving
its  overall,  large  scale  shape. 

Figures 6 and 7, however, show that the isotropic Gaussian blurring operation
may obliterate evidence of extended edges when they occur in proximity
to  large  yet  unrelated  regions  or  when  they  enclose  narrow  regions.  In  figure 
6,  the  string  tied  to  the  apple  is  lost  altogether  under  thresholding  following 
Gaussian  blurring.  Because  of  its  narrow  width,  it  dissipates  away  under 
even  moderate  amounts  of  blurring. 
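
The loss of the string can be reproduced numerically. The sketch below is an illustration only, not the procedure used to produce the figures: it blurs a hypothetical 64 by 64 silhouette (a square "body" with a one-pixel-wide "string") with a separable, normalised Gaussian truncated at 3σ and padded with ground pixels, all of which are assumptions.

```python
import numpy as np

def gaussian_blur(img, sigma, a=3):
    """Isotropic Gaussian smoothing done as two 1-D passes (rows, then columns)."""
    half = int(np.ceil(a * sigma))
    i = np.arange(-half, half + 1)
    g = np.exp(-0.5 * (i / sigma) ** 2)
    g /= g.sum()
    padded = np.pad(np.asarray(img, dtype=float), half)  # pad with ground (0)
    rows = np.apply_along_axis(np.convolve, 1, padded, g, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="valid")

def gradient_magnitude(field):
    """Magnitude of the finite-difference gradient of a smoothed field."""
    gy, gx = np.gradient(field)
    return np.hypot(gx, gy)

# Hypothetical silhouette: figure = 1, ground = 0.
img = np.zeros((64, 64))
img[30:60, 17:47] = 1      # body
img[4:30, 32] = 1          # string, one pixel wide

blurred = gaussian_blur(img, sigma=4)
edges = gradient_magnitude(blurred)
on_string = blurred[15, 32]   # "density of matter" half way up the string
in_body = blurred[45, 32]     # "density of matter" at the centre of the body
```

At σ = 4 the density on the string falls to roughly a tenth of its original value while the centre of the body is unchanged, so any threshold high enough to reject noise discards the string.
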

The converse problem arises in figure 7, in which the apple shape is placed
next to the banana. Now, the results of Gaussian smoothing and coarse scale
edge  detection  yield  an  apparent  coarse  scale  contour  for  the  apple  shape  that 
is  substantially  different  from  the  one  obtained  in  figure  5.  What  happens  is 
that,  at  coarse  degrees  of  smoothing,  “matter”  from  the  banana  leaks  over  to 
the  region  of  the  apple.  Evidently,  under  Gaussian  blurring,  the  coarse  scale 
description  of  an  object’s  shape  cannot  be  trusted  to  remain  stable  under  the 
presence  of  nearby  objects,  even  when  no  object  occludes  any  other.  Again, 
as  in  the  contour  smoothing  case,  this  instability  effectively  undermines  the 
purpose of multiscale shape analysis.


² This is the foundation of the popular Canny edge detector.
2.3  Oriented Region-Based Filters

Another class of region-based operators for extracting events at multiple
scales is that of oriented filters, such as the Gabor filters [Daugman, 1985]. Here,
we illustrate the performance of oriented edge masks consisting of a Gaussian
weighting along the length of the edge, and the derivative of a Gaussian across
the edge (figure 8) (see [Zucker and Iverson, 1987], who use the 2nd derivative
of  the  Gaussian).  Orientation  tuning  is  determined  by  the  relative  widths 
of  these  profiles.  Because  oriented  filters  carry  out  spatial  averaging  non- 
isotropically,  that  is,  depending  upon  the  orientation  and  eccentricity  of  the 
mask,  they  perhaps  stand  a  better  chance  of  achieving  smoothing  along  the 
length  of  a  contour,  while  isolating  regions  lying  on  opposite  sides  of  the 
contour. 
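
A mask of the kind described can be written down directly. The following is a minimal sketch, assuming a square support truncated at three times the larger width and an arbitrary sign convention for the figure side; the function name and parameters are hypothetical, and the masks actually used for the figures may differ.

```python
import numpy as np

def oriented_edge_mask(sigma_along, sigma_across, theta, a=3):
    """Gaussian profile along the edge axis times a first-derivative-of-Gaussian
    profile across it, rotated to orientation theta (radians)."""
    half = int(np.ceil(a * max(sigma_along, sigma_across)))
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    u = xs * np.cos(theta) + ys * np.sin(theta)     # coordinate along the edge
    v = -xs * np.sin(theta) + ys * np.cos(theta)    # coordinate across the edge
    along = np.exp(-0.5 * (u / sigma_along) ** 2)
    across = -(v / sigma_across ** 2) * np.exp(-0.5 * (v / sigma_across) ** 2)
    return along * across

# Sixteen orientations, here assumed to span the full circle.
thetas = np.arange(16) * 2 * np.pi / 16
masks = [oriented_edge_mask(4.0, 1.5, t) for t in thetas]
```

The ratio of `sigma_along` to `sigma_across` sets the eccentricity of the mask and therefore how sharply it is tuned to orientation.
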

Figure  9  shows  the  results  of  oriented  edge  detection  for  the  apple  shape. 
The filter mask was convolved with the original binary image at sixteen
different orientations for each scale, yielding sixteen grey-level arrays for
each scale. In order to facilitate presentation, it is convenient to condense this


Figure  8.  Oriented  two-dimensional  edge  mask. 


Figure 9. Apple shape under oriented edge filtering. a. ...
denote orientations of edges after thinning and thresholding. b. ...
filter response out of 16 orientations.


large amount of information into two arrays of numbers for each scale. One
(figure 9b) depicts the strength of the maximally responding filter
at each spatial location; the other (figure 9a) shows the orientation of the
maximally responding filters for a selected subset of spatial locations, such as,
for example, locations where the filter response is above a certain threshold.
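
The condensing step just described amounts to a maximum and an arg-max over the orientation axis. A minimal sketch, assuming the sixteen responses for one scale are stacked in an array of shape (16, H, W); the function name, the use of NaN for suppressed locations, and the toy data are all hypothetical.

```python
import numpy as np

def condense_orientations(responses, thetas, threshold):
    """Reduce K oriented response arrays to a strength map and an orientation map."""
    responses = np.asarray(responses, dtype=float)      # shape (K, H, W)
    best = np.argmax(responses, axis=0)                 # index of strongest filter
    strength = np.take_along_axis(responses, best[None], axis=0)[0]
    # Report an orientation only where the response exceeds the threshold.
    orientation = np.where(strength > threshold, np.asarray(thetas)[best], np.nan)
    return strength, orientation

# Toy data: two locations respond, only one of them strongly.
thetas = np.arange(16) * 2 * np.pi / 16
stack = np.zeros((16, 2, 2))
stack[3, 0, 0] = 0.9
stack[5, 1, 1] = 0.2
strength, orientation = condense_orientations(stack, thetas, threshold=0.5)
```
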

Figure 10 indicates that the performance of oriented filters in identifying
extended edges at coarse scales is improved over isotropic Gaussian smoothing.
For example, in the absence of background clutter, the string is detected
at  fairly  coarse  scales  when  its  boundary  contour  aligns  with  the  orientation 
axis  of  the  elongated  mask. 

However, figure 11 suggests that cases yet exist where oriented edge filters
fail  to  identify  important  coarse  scale  edges.  One  source  of  difficulty  arises 
from  the  fact  that  large  aspect  ratios  may  be  required  to  detect  long  edges 
bounding an object placed very near to another object. Such greatly elongated
filters by and large bring severe orientation tuning, and an inordinate
number of them may be required to cover the visual field at all orientations.
It is not clear to what extent this problem tarnishes the advantages of oriented
filters.

Uniform numerical smoothing techniques are conceptually straightforward
and simple to apply, but these virtues in themselves provide no sound basis
for believing that they should necessarily extract the important shape properties
that later visual processes can most effectively use. It seems possible,
though,  that  oriented  filters  may  yet  offer  some  promise  for  finding  large  scale 
structure  in  shape  images.  We  leave  them  as  a  subject  for  additional  study, 
and  turn  next  to  a  very  different  approach  to  multiscale  shape  analysis. 


3  The  Scale-Space  Blackboard 
3.1  Tokens  vs.  Fields  of  Numbers 

The purpose of a shape representation is to distinguish, identify, and
characterize (to make explicit) certain shape properties and spatial events
in the shape image that are likely to have significance either in the external
world or to the system’s task goals. By highlighting and naming these
events, important information can be more easily manipulated by later processes
carrying out pattern matching, counting, tracing, perceptual grouping,
and  other  operations. 

Alternative interpretations are available for what it takes to “make information
explicit.” In the case of typical region-based edge detecting filters,
for example, “edgeness” is made explicit over the entire image in the form
of a field of numbers describing the response strength of a convolution kernel
centered at each pixel. On the other hand, edge information may also
be said to have been made explicit in a list of line segments fit to edges in
the image. The former representation may be called iconic, or image-like
[Pylyshyn, 1973, 1981; Anderson, 1978; Kosslyn et al., 1979], while the
latter is considered symbolic. Most approaches to later shape interpretation
employ symbolic representations because they offer greater flexibility in assigning
meaningful interpretations to parts of shape, for example, that “this
edge  corresponds  to  the  stem  of  an  apple.” 

This  work  adopts  an  intermediate  representational  format  preserving  the 
spatial  character  of  an  iconic  representation  while  permitting  symbolic  tags 
to  be  attached  to  spatial  events  occurring  in  a  shape  image.  The  genus 
may  be  called  semi-iconic  representation.  Information  is  made  explicit  via 
symbolic  tokens.  Tokens  are  symbolic  in  that,  unlike  pixel  values,  each  token 
can  maintain  lists  of  properties,  pointers,  and  other  items  of  internal  state. 
Yet,  the  pictorial  aspect  of  spatial  geometry  is  preserved  by  the  assignment 
to  each  token  of  a  location  on  the  shape  image.  Furthermore,  as  is  discussed 
in  the  next  section,  the  tokens  may  be  indexed  by  spatial  location.  Not 
every  point  in  the  image  is  necessarily  covered  by  a  token,  however,  and 
some  locations  may  be  associated  with  more  than  one  token.  The  use  of 
tokens  in  making  explicit  important  image  events  was  introduced  by  Marr 
[1976,  1982]  in  his  proposal  of  the  Primal  Sketch  as  an  early  visual  image 
representation,  and  has  been  applied  to  multiscale  straight  line  extraction  by 
Figure 12. A sharp corner may be continuously deformed into a flattened
corner. As the flattened edge gradually disappears, at some point a decision
must be made that a corresponding edge token should no longer be asserted.
A priori, no principled grounds exist for defining the decision criteria.

Weiss  and  Boldt  [1986]  (see  also  Boldt  and  Weiss,  [1987]). 

The  transition  from  an  iconic  to  a  symbolic  representation  raises  an  issue 
of  discretization.  Shapes  are  fundamentally  continuous  things.  Consider  the 
sharp  corner  shape  shown  in  figure  12e.  This  may  be  continuously  deformed 
into  a  flattened  corner,  figure  12a.  An  iconic  representation  has  no  trouble 
describing  shapes  anywhere  along  this  continuum  because  every  location  is 
assigned some pixel value. In contrast, a symbolic or a semi-iconic representation
is inherently discrete: properties are asserted only for locations where
a symbol or token has been assigned. Any time a discrete representation is
to be computed from a continuous representation, qualitative decisions must
be made of the form, “Should we put a token here?” Usually this decision involves
the use of some threshold value, for example, “put a token everywhere
an edge is present stronger than x.”

It  is  important  that  later  processes  performing  operations  on  discretized 
representations  not  rely  upon  the  presence  or  absence  of  tokens  that  might 
or  might  not  have  been  asserted  had  a  threshold  been  slightly  different.  This 
is  to  say,  it  is  desirable  for  a  shape  representation  to  preserve  the  continuous 
qualities  that  the  world  of  naturally  occurring  shapes  in  fact  displays.  We 
attempt to abide by this principle by endowing each token with a strength
parameter³. The strength parameter indicates to roughly what degree the
shape property associated with a token is asserted at that token’s particular
location in the image. Later processes manipulating the information
conveyed  by  shape  tokens  are  intended  to  achieve  independence  from  the 
instabilities  of  early  quantization  steps  by  modulating  their  computations 

³ Alternatively, this may be called a response-strength or activity parameter.
Figure 13. An edge primitive is marked by a token. The edge is viewed
as having spatial extent roughly corresponding to a Gaussian ellipsoid. A
primitive edge token is displayed either as an ellipse (a), or as a line segment
with a circle at the “front” end indicating the figure/ground orientation of
the edge (b).


according  to  the  tokens’  strength  parameters.  As  a  given  shape  property 
fades from significance, its later implications can have waned before its associated
token disappears entirely.

The  primary  token  employed  in  building  multiscale  shape  descriptions 
is  the  edge  primitive.  In  addition  to  strength,  an  edge  primitive  possesses 
the attributes of x spatial location, y spatial location, orientation, and scale.
The  primitive  edge  token  denotes  a  boundary  between  figure  and  ground 
occurring  approximately  along  its  length  axis,  in  much  the  same  way  as  that 
measured  by  the  oriented  edge  filter  shown  in  figure  8.  Though  its  token  is 
assigned specific (x, y) coordinates, an edge primitive is to be interpreted as
asserting  information  about  some  elongated  local  region  as  shown  in  figure 
13.  The  edge  assertion  is  to  be  considered  strongest  at  the  center  of  the 
region,  and  it  diminishes  with  increasing  distance. 
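
One way to picture an edge primitive is as a small record carrying the five attributes named above, together with a soft spatial footprint. The sketch below is illustrative only: the class name, the `support` method, the use of `scale` as the width of the footprint, and the `elongation` factor are assumptions, not definitions taken from the text.

```python
import math
from dataclasses import dataclass

@dataclass
class EdgeToken:
    x: float            # x spatial location
    y: float            # y spatial location
    orientation: float  # radians; direction of the token's length axis
    scale: float
    strength: float

    def support(self, px, py, elongation=2.0):
        """Strength-weighted Gaussian ellipsoid footprint evaluated at (px, py).

        Largest at the token centre and diminishing with distance, falling off
        more slowly along the length axis than across it (assumed ratio)."""
        dx, dy = px - self.x, py - self.y
        c, s = math.cos(self.orientation), math.sin(self.orientation)
        u = dx * c + dy * s        # offset along the edge
        v = -dx * s + dy * c       # offset across the edge
        su, sv = elongation * self.scale, self.scale
        return self.strength * math.exp(-0.5 * ((u / su) ** 2 + (v / sv) ** 2))

token = EdgeToken(x=0.0, y=0.0, orientation=0.0, scale=1.0, strength=0.8)
at_centre = token.support(0.0, 0.0)
along_axis = token.support(2.0, 0.0)
across_axis = token.support(0.0, 2.0)
```

A later process can weight its conclusions by `strength` or by `support(...)` rather than by the mere presence of the token, which is the sense in which computations are modulated by the strength parameter.
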

3.2  Justification  for  Scale-Space 

Despite  their  deficiencies  in  extracting  coarse  scale  structure,  contour  based 
and  region  based  numeric  smoothing  techniques  deliver  identical  results  in 
the  limit  of  the  finest  scales  of  resolution.  For  example,  were  we  to  distribute 
edge-denoting tokens at nearby intervals along a very slightly smoothed object
boundary contour, these would agree with tokens located by taking
the maximum gradient magnitude following slight two-dimensional Gaussian
smoothing. Although we would properly label these as fine scale edges,
the  coarse  scale  structure  of  the  shape  remains  implicit  in  the  distribution 
of  tokens  about  the  image.  Our  goal  is  to  make  this  coarser  scale  structure 
explicit,  for  example  by  placing  appropriate  additional  tokens  on  an  image. 

The  approach  we  offer  to  computing  where  such  additional  tokens  might 
go  is  to  look  directly  at  patterns  of  smaller  scale  tokens  already  present.  The 
style  of  computation  corresponds  to  what  is  widely  known  as  a  “blackboard 
architecture”  in  the  Artificial  Intelligence  literature:  maintain  a  set  of  current 
assertions,  as  if  they  were  written  out  on  a  blackboard.  A  set  of  rules  or 
procedures  performs  pattern  matching  on  the  contents  of  the  blackboard, 
and  updates  these  contents  by  erasing,  adding,  and  modifying  assertions.  In 
the  present  case,  assertions  about  shape  are  made  by  placing  shape  tokens 
into  the  blackboard. 

3.2.1  Indexing  Spatial  Information  in  a  Blackboard 

A  number  of  important  design  choices  are  available  as  to  just  where  and  how 
various  aspects  of  shape  information  are  to  be  stored  and  organized,  using  a 
blackboard architecture. Note that a two-dimensional (as in a physical
blackboard) or n-dimensional spatial arrangement is only an optional
component of the organization of blackboard architectures as they are
classically viewed.

The  most  crucial  set  of  issues  revolves  around  the  means  provided  for 
indexing  into  the  blackboard,  that  is,  for  addressing  and  accessing  the  shape 
information it contains. The following question arises: To what degree is
information viewed as residing “inside” a token, and to what degree does it
reside in the token’s location in some coordinate system defined on the
blackboard?
To  illustrate,  the  information  borne  by  each  edge  token  could  be  written 
on  a  scrap  of  paper  tossed  in  a  heap;  one  examines  symbols  written  on  the 
scraps  to  read  off  tokens’  location  in  space,  orientation,  and  other  properties. 
The  blackboard  becomes  then  the  heap  of  paper.  Alternatively,  a  physical 
blackboard  on  a  wall  may  easily  be  assigned  a  two-dimensional  coordinate 
system  making  explicit  horizontal  and  vertical  distance  from  an  origin;  a 
shape  token  might  correspond  to  a  dot  drawn  on  the  blackboard,  this  token 
expressing  information  only  by  virtue  of  its  location  on  the  board’s  surface. 

Obviously, each scheme has its advantages and disadvantages. The
token-as-scraps-of-paper scheme permits each token to maintain a large
number of properties about itself, such as location, orientation, strength,
time of day that it was created, and so forth, but this scheme offers no
efficient way of attacking the heap to find a token possessing a given set
of properties. Conversely, the coordinate-system scheme provides a handy
means for indexing information on the basis of content (is there an edge at
location (4,5)? just go there and look), but it requires that the blackboard
have as many dimensions as independent pieces of information denoted by
each token.

For  the  present  purposes,  we  adopt  an  intermediate  course:  tape  scraps 
of  paper  to  the  blackboard.  Tokens  are  localized  on  the  blackboard  in  terms 
of  a  coordinate  system  organizing  along  a  few  crucial  properties,  but  each 
token  possesses  internal  state  maintaining  additional  useful  information.  The 
interesting  design  choice  arising  is,  which  information  is  important  enough 
to  merit  its  own  coordinate  dimension  on  the  blackboard? 

In the world of two-dimensional shape objects, four leading candidates
present themselves. These are x spatial location, y spatial location,
orientation, and scale. These are the four geometric parameters fixing an
edge primitive in the representation: Where is it?, What is its
orientation?, and How big is it? Because shape silhouettes are by definition
two-dimensional images, x, y coordinates are obvious choices for structuring
the blackboard. As for the other two candidates, Walters [1987] has argued
in favor of rho-space, in which a third, $\rho$, dimension makes explicit
the orientation of features, and Witkin [1983] suggests creating a
scale-space by establishing a separate scale dimension [footnote 4].

Scale-space segregates spatial events of different sizes, that is, it
provides a handle for indexing information on the basis of scale. The size
of an edge primitive, for example, is indicated by the placement, along a
separate scale ($\sigma$) dimension, of a token corresponding to that edge.
This organization simplifies the sequence of operations required to query a
shape description as to whether certain properties are true of the object
under observation. If a pattern matching rule needs to know whether a medium
scale edge at location (5, 6) and orientation 32° is present in order to
decide that an object has parallel sides, then under a scale-space
organization it may more rapidly narrow down the set of tokens that must be
examined than if it had to check through tokens representing all scales.
Depending upon the degree to which algorithms for analyzing shape regard
scale as an important shape property, this gain in efficiency may be as
significant as that obtained by ruling the blackboard with x, y spatial
coordinates.

Footnote 4: Witkin’s original presentation of scale-space dealt with the
evolution across scales of zero-crossings of a DOG-filtered one-dimensional
signal, as the width of the Gaussian filter increases. Here, we forbear zero
crossings and instead refer only to the use of an independent dimension
denoting size or scale.

Similar gains in efficiency may be obtainable, for some purposes, with
blackboard organizations making explicit a separate orientation dimension.
However, given the stated purpose of identifying the multiscale structure of
shapes, and because of the difficulties in managing high-dimensional spaces,
the present work sacrifices the possibility of indexing shape information
directly on the basis of orientation, and instead employs a Scale-Space
Blackboard consisting of two spatial dimensions plus one scale dimension.

3.3  Behavior  of  Scale-Space 

Scale-space  possesses  a  number  of  useful  and  interesting  properties  whose 
examination  clarifies  what  it  means  for  a  shape  event  to  be  “at  a  certain 
scale.”  The  maintenance  of  these  desirable  properties  may  depend  upon  the 
enforcement  of  certain  definitions  and  conventions  over  the  computational 
operations  that  act  upon  the  scale-space  data  structure. 

3.3.1  Self-Similarity  Across  Scales 

The principal quality offered by scale-space is self-similarity across
scales [Burt and Adelson, 1983]: it is most convenient that a computation
performed on any shape of a given size yields the same results as the same
computation performed on an identical shape that has been uniformly
magnified (or reduced) in size. For example, the tests establishing whether
four line segments are arranged as a square (adjacent edges perpendicular,
opposite edges lying at a distance equal to their lengths, ratio of diagonal
to edge length equal to $\sqrt{2}$, and so forth) should be the same no
matter how large or small the square is.

The most important implication of the self-similarity principle is that
computations on scale space should be defined so that magnifications in the
spatial dimensions correlate with uniform translations in the scale
dimension. Figure 14 illustrates in the case of a simplified scale-space
consisting of a scale dimension and only one spatial dimension. Two shape
features possessing different sizes and spatial locations are represented as
tokens placed at different scales and spatial locations in scale space. Call
their proximity in scale space $(\Delta x, \Delta\sigma)$. Now, take the
original shapes and simply magnify the picture by a factor, m. Obviously,
the features each grow in size, and the distance between them increases by
this factor, but their relative distance (distance relative to size) does
not change. Under the self-similarity principle, the scale space image of
this new picture places tokens in proximity to each other,
$(m \Delta x, \Delta\sigma)$; the shape features’ preserved relative sizes
become manifest as a preserved distance along the scale dimension.

Figure 14. a. A one-dimensional figure composed of two binary pulses. b.
The same figure magnified in the spatial dimension by a factor, m.
Scale-space images of these shapes are shown above. Each pulse is depicted
as a dot, and the width of the pulse determines the dot’s placement along
the scale ($\sigma$) dimension. The principle of self-similarity across
scales dictates that when the relative distance of shape features is
preserved, their distance along the scale dimension ($\Delta\sigma$) is also
preserved.

Figure 15. At coarse scales a long smooth edge and a long jagged edge
appear identical. Only at finer scales do edge primitives obtain sufficient
resolution to distinguish smaller scale detail.

In order to enforce this property the scale dimension is graduated on a
logarithmic scale [Witkin, 1983; Schwartz, 1980]. Consider a shape event,
for example, an edge primitive, occurring at some reference scale,
$\sigma = 0$. The placement along the scale dimension of another edge
primitive which is identical to the first, but uniformly magnified by a
factor, m, is given by:

$$\sigma = A \log m, \qquad (3)$$

where A is a constant.
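
As a small numerical illustration of the logarithmic graduation in equation (3), the sketch below places two features on the scale axis and checks that magnifying the whole picture shifts both tokens by the same amount, so that their separation along the scale dimension is unchanged. The function name `scale_coordinate`, the choice A = 1 with natural logarithms, and the feature sizes are assumptions made only for this example.

```python
import math

A = 1.0  # assumed value of the constant A (natural-log units)

def scale_coordinate(size, reference_size=1.0, a=A):
    """Equation (3): sigma = A log m, where m = size / reference_size."""
    return a * math.log(size / reference_size)

# Two features of different sizes (for example pulse widths, arbitrary units).
small, large = 2.0, 8.0
delta_sigma = scale_coordinate(large) - scale_coordinate(small)

# Magnify the whole picture by m: both tokens shift by A log m.
m = 3.0
delta_sigma_magnified = scale_coordinate(m * large) - scale_coordinate(m * small)

print(delta_sigma, delta_sigma_magnified)
```

Any other positive value of A, or any other logarithm base, would only rescale the axis; the invariance of the separation does not depend on that choice.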

Another significant consequence of the self-similarity principle is that
precision in the specification of a spatial event’s location depends upon
the scale of that event. Suppose that some tolerance is associated with
stating the exact placement, in x and y, of a token denoting a primitive
edge.
This  tolerance  region  may  for  convenience  be  considered  equivalent  to  the 
region  of  space  described  by  a  shape  token  (figure  13).  Then  self-similarity 
implies  that  this  tolerance  region  grows  proportionally  with  the  size  of  the 
edge  primitive.  This  is  to  imply  that  a  large  scale  edge  primitive  alone  does 
not  precisely  localize  the  boundary  of  the  shape  object  that  gave  rise  to  it. 

Further implications arise concerning the meaning contained by the
assertion of a primitive shape event occurring “at scale $\sigma$”. As
illustrated in figure 15, a long, well defined edge, and a long jagged edge,
appear at coarse scales as identical in terms of edge primitives. It is only
when one examines medium and finer scale information that descriptive edge
primitives obtain sufficient precision to discriminate between these two
shape events. Thus, a complete description of even a geometrically simple
shape object must involve analysis of information across a wide range of
scales. For example, the description of a long, straight contour boundary,
in terms of tokens denoting edge primitives placed on a Scale-Space
Blackboard, will consist of a collection of tokens lying all along the
boundary, and at various depths in the scale dimension.

The Scale-Space Blackboard leaves open the possibility of inventing more
complex types of tokens that integrate shape information occurring over
several scales.

3.3.2  Scale-Normalized  Distance 

The measurement of distance plays an integral role in the analysis and
interpretation of shape. In order to conform to the principle of
self-similarity across scales, it is necessary that computations involving
distance measurements among shape tokens in the Scale-Space Blackboard be
able to take into account the relationship between distance and scale. Just
stating that two edge tokens are parallel and lie at 2 cm distance from one
another does not complete the story, for if they are both fine scale tokens
then they could have arisen from opposite ends of an object, while if they
are both coarse scale tokens they must by necessity be asserting virtually
the same information (see figure 16). Relative distance (distance relative
to scale) is the important property, not actual distance.


Figure 16. Whether or not the contours described by two edge primitive
tokens are in fact the same contour depends upon the tokens' scales as well
as their relative distance and orientation.

For this reason we define scale-normalized distance with the property
that the scale-normalized distance between a pair of tokens remains constant
as the configuration undergoes uniform magnification. By taking this step,
whenever computations take place involving relative distances between shape
tokens, scale is automatically taken into account. Some leeway is afforded
in the selection of the scale-normalized distance measure. We choose the
following:

Definition: The Scale Normalized Distance (sn-distance) between two
tokens occurring at scales $\sigma_1$ and $\sigma_2$, respectively, and
separated by a distance D, is given by

$$snD = \frac{2D}{e^{\sigma_1} + e^{\sigma_2}} \qquad (4)$$

The justification for this definition is as follows: If a unit distance is
measured at scale $\sigma = 0$, then this distance is magnified at scale
$\sigma$ by a factor $e^{\sigma}$ (the inverse of equation (3), taking
A = 1). Sn-distance adjusts for the scale of two tokens by dividing the
spatial distance between them by the average of their associated
magnification factors.
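
The defining property of sn-distance, constancy under uniform magnification, can be checked numerically. The sketch below follows the justification just given (spatial distance divided by the average of the two magnification factors) and assumes that scale is measured in natural-log units, so that the magnification factor at scale sigma is exp(sigma). The names `sn_distance` and `magnify` are illustrative, not taken from the paper.

```python
import math

def sn_distance(d, sigma1, sigma2):
    """Spatial distance d divided by the mean magnification of the two scales."""
    mean_magnification = 0.5 * (math.exp(sigma1) + math.exp(sigma2))
    return d / mean_magnification

def magnify(d, sigma1, sigma2, m):
    """Uniform magnification by m: the distance grows by m and each token
    moves log(m) along the scale dimension (assuming A = 1)."""
    shift = math.log(m)
    return m * d, sigma1 + shift, sigma2 + shift

before = sn_distance(2.0, 0.0, 1.0)
after = sn_distance(*magnify(2.0, 0.0, 1.0, m=5.0))
print(before, after)
```

For two tokens at the reference scale the mean magnification is 1, so the sn-distance reduces to the ordinary spatial distance.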

It  is  instructive  to  consider  the  behavior  of  the  sn-distance  between  two 
tokens  occurring  at  different  scales.  Imagine  three  tokens,  A,  B,  and  C, 
positioned  colinearly  and  as  shown  in  figure  17.  Their  pairwise  distances 
obey  the  relationship, 

$$D(A,B) + D(B,C) = D(A,C) \qquad (5)$$

When the tokens all occur at the same scale, their pairwise scale-normalized
distances also obey this relationship:

$$snD(A,B) + snD(B,C) = snD(A,C) \qquad (6)$$

But consider what happens when token B increases in scale. Then, by
equation (4), the sn-distances between tokens A and B, and between tokens B
and C, decrease, while the sn-distance between tokens A and C remains
unchanged. In general, the laws of Euclidean distances as expressed by
equation (6) do not hold for scale-normalized distance.
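
The breakdown of additivity is easy to reproduce. The following sketch places three colinear tokens at positions 0, 1, and 2 and compares the sum of the two short sn-distances with the long one, first with all tokens at the same scale and then with the middle token moved one unit coarser. It assumes the sn-distance is the spatial distance divided by the average of exp(sigma) for the two tokens; the helper names and the positions are made up for the example.

```python
import math

def sn_distance(d, sigma1, sigma2):
    """Spatial distance divided by the average magnification exp(sigma)."""
    return 2.0 * d / (math.exp(sigma1) + math.exp(sigma2))

def additivity_check(sigma_b):
    """Colinear tokens A, B, C at positions 0, 1, 2; A and C stay at scale 0.
    Returns (snD(A,B) + snD(B,C), snD(A,C))."""
    (xa, sa), (xb, sb), (xc, sc) = (0.0, 0.0), (1.0, sigma_b), (2.0, 0.0)
    ab = sn_distance(xb - xa, sa, sb)
    bc = sn_distance(xc - xb, sb, sc)
    ac = sn_distance(xc - xa, sa, sc)
    return ab + bc, ac

same_scale = additivity_check(0.0)      # the two values agree
b_coarser = additivity_check(1.0)       # the sum falls below snD(A,C)
print(same_scale, b_coarser)
```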

Figure 17. a. When colinear tokens occur at the same scale, then
scale-normalized distances behave according to the law,
snD(A,B) + snD(B,C) = snD(A,C). b. However, when token B is moved to a
coarser scale this relationship no longer holds.


3.3.3  Quantization  and  Sampling 

The x-y-$\sigma$ Scale-Space Blackboard data structure permits algorithms to
index into a shape description on the basis of spatial location and scale.
This is conceptually a continuous space. However, for purposes of
implementing the Scale-Space Blackboard on a computer, it becomes necessary
to quantize the space so that, for example, points in scale-space may be
assigned to elements of an array. As a purely practical matter, how might we
go about tessellating scale-space?

First, note that as long as shape tokens behave as scraps of paper on
which may be written down any information desired, then an appropriate
strategy is to include among this list of properties a token’s pose in
scale-space (spatial location, orientation and scale). Computations
involving a token’s pose should use this information rather than the
quantized array indices specifying the token’s address in the Scale-Space
Blackboard. This tactic ensures that whatever array quantization scheme is
used, its effects are confined to the efficiency of computation and do not
alter the results.

The array quantization issue separates into two: quantization along the
spatial coordinates, and quantization along the scale coordinate.
Quantization of the scale coordinate will depend in part on how closely
spaced along the scale dimension two different shape tokens, specifying
different properties, yet occurring at the same spatial location, might be
placed. To illustrate the question more clearly, figure 18 shows a figure
whose local orientation at a coarse scale is quite different from its local
orientation measured at a fine scale. Over how small a distance in the scale
dimension might such a phenomenon occur? We present no theoretical analysis
but simply relate empirical experience suggesting that a magnification of
about a factor of two (one octave) characterizes the rapidity with which the
information asserted at one scale can differ from the information asserted
at another scale. Thus, scale quantization at steps in the neighborhood of
one octave or slightly less seems about right.

Figure 18. At a given spatial location, the jagged contour can give rise to
edge primitives with different orientations at different scales.

As for the spatial dimensions, coordinate quantization should accord with
the purposes of the algorithms that consult the Scale-Space Blackboard. One
of the most common operations is likely to be a query of the form, “Is there
a token at pose P?” The purpose in making this query is of course really to
discover whether the shape object under analysis displays some spatial event
such as an edge at pose P, under the assumption that this spatial event will
be represented by a token (or tokens) in the Scale-Space Blackboard. It would
therefore seem reasonable to choose a tessellation size in the neighborhood
of the range of poses that a token might take in describing a given single
localized spatial event, i.e. choose array bin sizes to cover about the same
spatial  extent  as  the  spatial  localization  tolerance  of  a  shape  primitive  (figure 

Figure 19. A stack of two-dimensional arrays for implementing the
Scale-Space Blackboard. Each array bin holds a list of tokens falling within
its domain of scale-space. Coarser tessellation at coarser scales gives it a
resemblance to a pyramid data structure.


Note that individual elements or bins in the array maintaining the contents
of the Scale-Space Blackboard may contain not just one but several tokens.
Note also that appropriate spatial quantization changes with scale, so that
many fewer array elements need be provided per unit area at coarse scales
than at fine scales. A suitable picture is of a collection of
two-dimensional arrays stacked at octave distances along the scale
dimension, as shown in figure 19. This data structure closely parallels
pyramid style image representations [Sammet and Rosenfeld, 1980; Burt and
Adelson, 1983].
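
One way such a stack of arrays might be realized is sketched below. It is a minimal illustration, not the implementation used in this work: the class and field names, the use of a dictionary of bins in place of dense arrays, the bin width at the reference scale, and the convention that scale is stored in natural-log units are all assumptions. Each token keeps its own continuous pose, in line with the recommendation above, and the bins serve only to narrow down which tokens are examined.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Token:
    x: float
    y: float
    theta: float          # orientation in radians
    sigma: float          # scale, natural-log units (assumed convention)
    strength: float = 1.0

class ScaleSpaceBlackboard:
    """Bins stacked one octave apart; bin width doubles with each octave."""

    def __init__(self, base_bin_size=1.0):
        self.base_bin_size = base_bin_size   # assumed bin width at sigma = 0
        self.bins = defaultdict(list)        # (level, ix, iy) -> list of tokens

    def _key(self, x, y, sigma):
        level = math.floor(sigma / math.log(2.0))      # octave index
        size = self.base_bin_size * 2.0 ** level       # coarser bins when coarser
        return level, math.floor(x / size), math.floor(y / size)

    def add(self, token):
        self.bins[self._key(token.x, token.y, token.sigma)].append(token)

    def near(self, x, y, sigma, reach=1):
        """Tokens in bins within `reach` bins of (x, y), at the octave of sigma."""
        level, ix, iy = self._key(x, y, sigma)
        found = []
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                found.extend(self.bins.get((level, ix + dx, iy + dy), []))
        return found

board = ScaleSpaceBlackboard()
board.add(Token(4.2, 5.1, 0.0, sigma=0.1))
board.add(Token(4.6, 5.3, 0.1, sigma=1.0))     # lands one octave coarser
board.add(Token(40.0, 50.0, 0.0, sigma=0.1))   # same octave, far away
fine_hits = board.near(4.0, 5.0, sigma=0.1)
coarse_hits = board.near(4.0, 5.0, sigma=1.0)
```

A query touches only the handful of bins around the requested location at the requested octave, which is the efficiency gain the scale dimension is meant to provide.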


4  Multiscale Description by Fine-to-Coarse Aggregation

We  are  now  equipped  to  offer  a  procedure  for  building  a  multiscale  shape 
description  one  scale  at  a  time,  from  fine  scales  to  coarse.  A  shape  is  at  this 
early  stage  described  in  terms  of  edge  primitives  possessing  the  attributes 
of  location,  orientation,  scale,  and  strength.  A  token’s  strength  attribute 
indicates  something  like  “how  good”  an  edge  is  present  at  the  token’s  pose. 
The  objective  for  the  fine-to-coarse  aggregation  procedure  is  to  place  “good” 
edges  at  successively  coarser  scales,  starting  with  primitive  edge  tokens  placed 
at  intervals  along  the  shape  object’s  boundary  contour  at  some  initial  (finest) 
scale.  The  aggregation  procedure  iterates,  proceeding  from  fine  scales  to 
coarse,  until  a  desired  coarseness  of  description  is  reached. 

The design of a fine-to-coarse aggregation procedure is motivated by
considering configurations of edge primitives that give rise to good coarser
scale edges. A sampling of prototypical situations is presented in figure 20.

Figure 20a is the simplest case. A collection of finer scale edges that align
with one another gives rise straightforwardly to a coarser scale edge. Note
in this figure that the portion of the image that a given edge token
describes may overlap with that of other edge tokens. The spacing of
primitive edge assertions along a contour is a free parameter of the
representation. For reasons elaborated below, we find it useful for one edge
primitive to overlap the next by about 50% of its length.

Figure 20. Configurations of finer scale edge primitives (solid ellipses)
supporting assertions of edge primitives one octave coarser in scale (dashed
ellipses).

Figure 20b shows that a section of curved contour gives rise to edge tokens
very well aligned with one another at fine scales, but with increasing
orientation difference at coarser scales. We suggest that coarser scale
primitive edges associated with curved contours be considered weaker than
edge primitives associated with straight contours, in much the same way that
a coarse scale oriented edge filter would give a weaker response to a curved
contour than to a straight edge.

Figure  20c  illustrates  that  a  broken  contour  appearing  at  a  fine  scale  as 
two  aligned  yet  disparate  portions  of  a  shape  may  nevertheless  be  described 
by  a  single  edge  primitive  at  a  coarser  scale.  This  is  to  say,  the  pattern 
matching  methods  deciding  where  coarse  scale  edges  are  to  be  placed  must 
be  able  to  identify  pairs  of  finer  scale  edges  aligning  with  one  another  across 
a  gap  or  protrusion. 

Finally, figure 20d shows that, when appropriately configured, a collection of
fine  scale  edges  may  individually  have  very  different  orientations  from  the 
coarser  scale  edge  that  the  collection  generates.  The  algorithm  described  in 
this  paper  omits  explicit  consideration  of  this  type  of  situation. 

4.1  Fine-to-Coarse  Aggregation  Procedure 

The basic step of the fine-to-coarse aggregation procedure takes as input a
set of primitive edge tokens occurring at a single scale, $\sigma_i$, in the
Scale-Space Blackboard, and it returns a set of new edge primitives at scale
$\sigma_c$. Let us refer to scale $\sigma_i$ as the current “input” scale,
and scale $\sigma_c$ as the “coarser” scale. As implemented, the new tokens
delivered are one octave coarser in scale than the input tokens, though the
algorithm does not depend upon this rate of aggregation. The basic step
proceeds in four smaller steps:

I.  Identify  seed  poses  for  new  coarser  scale  tokens. 

II.  Starting  from  the  seeds,  refine  the  placement  of  new  coarser  scale  tokens 
based  on  primitive  edge  tokens  occurring  at  the  input  scale. 


III.  Determine  the  strengths  of  these  coarser  scale  tokens. 

IV.  Prune  redundant  coarser  scale  tokens. 

These  steps  are  discussed  in  turn. 

4.1.1  Step  I.  Identify  Seed  Poses  for  Coarser  Scale  Tokens 

A  seed  pose  is  an  initial  guess  as  to  where  a  coarser  scale  token  might  be  well 
placed.  Observing  figure  20,  we  introduce  seed  poses  at  every  primitive  edge 
token  at  the  input  scale,  and  at  locations  where  two  primitive  edge  tokens 
approximately  align  with  one  another  across  an  sn-distance  (scale-normalized 
distance) approximately equal to twice the length of a token. Call the
latter case “gap-jumping” seeds. The orientation of a gap-jumping seed is
taken  to  be  the  average  orientation  of  the  two  input  tokens  that  gave  rise  to 
it. 

The  detection  of  gap-jumping  seeds  requires  checking  of  input  tokens 
pairwise  to  determine  whether  or  not  they  fulfill  the  seeding  qualifications,  i.e. 
proper  distance  and  alignment  (and  no  other  token  aligned  in  between).  This 
operation  is  assisted  enormously  by  the  spatial  and  scale  indexing  provided 
by  the  Scale-Space  Blackboard,  as  this  data  structure  greatly  facilitates  the 
inspection  of  only  tokens  lying  within  some  spatial  neighborhood. 
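
Step I might be sketched as follows. This is an illustration under several assumptions rather than the procedure as implemented: the token length in scale-normalized units, the two tolerances, and the placement of a gap-jumping seed at the midpoint of its two parent tokens are all hypothetical choices. For brevity the sketch compares every pair of tokens directly instead of consulting the blackboard's spatial bins, and it omits the test that no other token lies aligned in between.

```python
import math
from dataclasses import dataclass

@dataclass
class Token:
    x: float
    y: float
    theta: float   # orientation in radians, meaningful modulo pi
    sigma: float   # scale in natural-log units (assumed convention)

TOKEN_SN_LENGTH = 1.0                 # assumed token length in sn units
GAP_TOLERANCE = 0.5                   # assumed slack on "about twice the length"
ANGLE_TOLERANCE = math.radians(15.0)  # assumed alignment tolerance

def angle_diff(a, b):
    """Smallest difference between two undirected orientations."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)

def find_seeds(tokens):
    """Return seed poses (x, y, theta) from tokens at one input scale."""
    seeds = [(t.x, t.y, t.theta) for t in tokens]   # one seed per input token
    for i, a in enumerate(tokens):
        for b in tokens[i + 1:]:
            dx, dy = b.x - a.x, b.y - a.y
            sn_d = math.hypot(dx, dy) / math.exp(a.sigma)  # shared input scale
            if abs(sn_d - 2.0 * TOKEN_SN_LENGTH) > GAP_TOLERANCE:
                continue
            bearing = math.atan2(dy, dx)
            if (angle_diff(a.theta, b.theta) < ANGLE_TOLERANCE
                    and angle_diff(a.theta, bearing) < ANGLE_TOLERANCE
                    and angle_diff(b.theta, bearing) < ANGLE_TOLERANCE):
                # Average the two orientations, respecting the mod-pi wraparound.
                half_turn = (b.theta - a.theta + math.pi / 2) % math.pi - math.pi / 2
                theta = (a.theta + 0.5 * half_turn) % math.pi
                seeds.append(((a.x + b.x) / 2.0, (a.y + b.y) / 2.0, theta))
    return seeds

tokens = [Token(0.0, 0.0, 0.0, 0.0),
          Token(2.0, 0.0, 0.05, 0.0),
          Token(0.0, 5.0, math.pi / 2, 0.0)]
seeds = find_seeds(tokens)
```

In the example only the first two tokens align across a gap of the right size, so the result is the three per-token seeds plus one gap-jumping seed midway between them.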

4.1.2  Step  II.  Refine  the  Placement  of  Coarser  Scale  Tokens 

The  second  step  is,  for  each  seed,  to  determine  the  best  pose  for  a  new  coarser 
scale  token  suggested  by  this  seed.  Selecting  the  “best  pose”  originating  from 
a  given  seed  involves  finding  a  pose  that  tends  to  maximize  the  strength  of 
the  resulting  coarser  scale  token  while  tethering  the  new  pose  so  that  it  still 
“belongs”  to  the  seed. 

Figure 21. A token at scale $\sigma_c$ is placed by taking a weighted
average of information contained in a set of support tokens occurring at
scale $\sigma_i$.

The general approach of the fine-to-coarse grouping procedure is that a
coarser scale description is to be aggregated from the information contained
in the finer scales. Accordingly, the algorithm computes a coarser scale
token’s pose as a weighted average of pose information over some support set
of input tokens in the neighborhood of the seed (see figure 21). A question
immediately arises as to how each supporting input token associated with
a given new coarser scale token is to be weighted relative to the other
supporting tokens. The factors influencing this weighting are: (1) the
spatial relationship between the seed pose and the pose of the supporting
input scale token, (2) the proximity of other nearby, possibly redundant,
supporting input scale tokens, and (3) this supporting input scale token’s
strength. These factors are dealt with as follows:


1. Spatial relationship between seed pose and supporting input
scale token. Figure 22a shows several possible configurations among a
seed pose and the pose of an input-scale token that will have some influence
on the placement of a new, coarser scale token initially placed at the seed
pose. How should this influence, or weight, be assigned, say, as a number
between 0 (low influence) and 1 (high influence)? From figure 22 we reason
that influence should: (1) decrease with distance from the seed pose, (2)
decrease with distance faster across the orientation of the seed pose than
along its orientation, (3) decrease as the orientations of the seed pose and
the supporting token diverge, but (4) less so as their sn-distance
decreases. These factors translate into the following expression for
calculating the raw-influence-weight, $W'_i$, of a token, $T_i$, occurring
at scale $\sigma_i$, on the pose of a token, $T_c$, at the next scale,
$\sigma_c$, which has been initially placed at its seed pose:

$$W'_i \leftarrow G(snD, \phi_{c,i}) \, [1 - \min(1, B \, snD^{p}) \, |\sin \Delta\theta_{c,i}|], \qquad (7)$$

where snD is the sn-distance between the seed and the supporting input
scale token, $\phi_{c,i}$ is the direction from token $T_c$ to token $T_i$,
$\Delta\theta_{c,i}$ is their relative orientation, and $G(D, \phi)$ is an
ellipsoidal two-dimensional Gaussian weighting function with major axis
aligned with $\phi = 0$ (see figures 22b and c). B and p are positive
constants. The ellipsoidal Gaussian weighting function has maximum value 1
when D = 0, and it trails off to 0 at infinity. This ellipsoid’s aspect
ratio is a free parameter, for which the value 4 : 1 has been found to serve
acceptably. The term in brackets drops below 1 only when tokens are
relatively distant and have substantially different orientations.

Figure 22. a. A number of possible spatial relationships between a coarser
scale token placed at its seed pose (larger line segment) and one of its
supporting finer scale tokens (shorter line segment). The supporting token's
influence is considered greater when it is near to and aligned with the seed
pose. b. The distance, D, and angle, $\phi$, entering into the Gaussian
weighting ellipsoid, $G(snD, \phi_{c,i})$, shown in c.
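
The raw-influence-weight of equation (7) can be sketched in a few lines. The sketch reads the equation as W' = G(snD, phi) [1 - min(1, B snD^p) |sin(delta_theta)|]. Only the 4 : 1 aspect ratio comes from the text; the spread of the Gaussian, the values chosen for B and p, and the convention that the direction phi is measured relative to the seed's orientation are assumptions made for illustration.

```python
import math

ASPECT = 4.0        # major : minor axis ratio, the value reported in the text
SIGMA_MINOR = 0.5   # assumed spread across the seed orientation, in sn units
B, P = 0.5, 2.0     # assumed values for the positive constants B and p

def gaussian_ellipse(d, phi):
    """Elliptical Gaussian G(D, phi): equals 1 at D = 0, with its major axis
    along phi = 0 (taken here to be the seed's orientation)."""
    along = d * math.cos(phi) / (ASPECT * SIGMA_MINOR)
    across = d * math.sin(phi) / SIGMA_MINOR
    return math.exp(-0.5 * (along ** 2 + across ** 2))

def raw_influence_weight(sn_d, phi, delta_theta):
    """W' = G(snD, phi) * [1 - min(1, B * snD**p) * |sin(delta_theta)|]."""
    penalty = min(1.0, B * sn_d ** P) * abs(math.sin(delta_theta))
    return gaussian_ellipse(sn_d, phi) * (1.0 - penalty)

at_seed = raw_influence_weight(0.0, 0.0, 0.0)
ahead = raw_influence_weight(1.0, 0.0, 0.0)             # along the seed
beside = raw_influence_weight(1.0, math.pi / 2, 0.0)    # across the seed
rotated = raw_influence_weight(1.0, 0.0, math.pi / 4)   # along, but misaligned
```

With these settings a token lying ahead of the seed along its orientation outweighs one at the same sn-distance to the side, and a misaligned token is discounted relative to an aligned one at the same position.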

Figure 23. The two smaller scale support tokens supply redundant pose
information.


2. The proximity of nearby, possibly redundant, supporting input
scale tokens. Figure 23 presents a situation in which two input scale
tokens are very near to one another, and would contribute similar influence
on the pose of a coarser scale token initiated at the seed pose shown. The
information that these two tokens offer about the underlying finer scale
shape is redundant, and these two tokens should not both share equal weight
with other tokens providing very different information. Some scheme is
required causing the information from input tokens located very near one
another to saturate in their collective influence upon the pose of the
coarser scale token under construction. This effect is achieved by the
following procedure:

I. Sort supporting input tokens by decreasing raw-influence-weight, $W'$.

II. For input token $T_i$, identify the supporting input token, $T_j$,
that: 1. has greater or equal raw-influence-weight, and 2. is most similar
in pose. Pose similarity, L, may be estimated by the following expression:

$$L(T_i, T_j) = G(snD, \phi_{i,j}) \cos \Delta\theta_{i,j} \qquad (8)$$

III. Choose the value of the modified-influence-weight, $W''_i$, for token
$T_i$ in such a manner that it decreases according to its degree of
similarity to its most similar stronger neighbor, $T_j$:

$$W''_i \leftarrow W'_i \, (1 - L(T_i, T_j)) \qquad (9)$$
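
The three-part procedure above can be sketched as a single function. The sketch takes the raw weights and a caller-supplied pose-similarity function returning values between 0 and 1. It assumes that the strongest token, which has no stronger neighbor, keeps its raw weight unchanged, and that ties in raw weight are broken by input order. The function and argument names are illustrative.

```python
def suppress_redundancy(raw_weights, similarity):
    """raw_weights[i] is the raw-influence-weight of token i;
    similarity(i, j) is the pose similarity L between tokens i and j.
    Returns the modified-influence-weights in the original token order."""
    # I. Sort token indices by decreasing raw weight.
    order = sorted(range(len(raw_weights)),
                   key=lambda i: raw_weights[i], reverse=True)
    modified = list(raw_weights)
    for rank, i in enumerate(order):
        stronger = order[:rank]          # tokens with greater or equal raw weight
        if not stronger:
            continue                     # assumed: the strongest token is kept as is
        # II. Similarity to the most similar stronger neighbor.
        best = max(similarity(i, j) for j in stronger)
        # III. Discount the weight by that similarity.
        modified[i] = raw_weights[i] * (1.0 - best)
    return modified

# Tokens 0 and 1 are near duplicates; token 2 is unlike either of them.
table = {(0, 1): 0.75, (1, 0): 0.75}
weights = suppress_redundancy([0.9, 0.8, 0.5],
                              lambda i, j: table.get((i, j), 0.0))
```

In the example the weaker of the two near-duplicate tokens is cut to a quarter of its raw weight, while the dissimilar third token is left untouched.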


3. Strength of this supporting input scale token. The influence-weight
of a supporting input scale token on the pose of a coarser scale token
should be proportional to the primitive edge strength, $S_i$, of that input
token.

Thus, finally, the influence-weight, $W_i$, of an input scale token $T_i$
on a given coarser scale token is expressed by

$$W_i \leftarrow S_i W''_i \qquad (10)$$

Once the influence-weights of all of its supporting input scale tokens have
been established, then the pose of each new coarser scale token may be
determined. The new token’s (x, y) location can simply be taken as the weighted
average of the (x, y) locations of supporting tokens, and its orientation as
that providing best alignment with the locations of the supporting tokens, in
the least-squares sense. If desired, it is possible to devise formulas assigning
the coarse scale token’s orientation on the basis of the aggregate orientations
of the supporting tokens as well as their locations.
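The saturation procedure (steps I to III), equation (10), and the weighted-average location can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: the names `SupportToken`, `example_similarity`, `influence_weights`, and `weighted_location` are hypothetical, the raw-influence-weights are taken as given, and the similarity function (a Gaussian falloff in distance times a clamped cosine of the orientation difference, with an arbitrary `sigma`) is only loosely modeled on equation (8).

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class SupportToken:
    x: float
    y: float
    theta: float       # orientation in radians
    strength: float    # primitive edge strength S_i
    raw_weight: float  # raw-influence-weight W'_i (assumed precomputed)


def example_similarity(a: SupportToken, b: SupportToken, sigma: float = 1.0) -> float:
    """Hypothetical pose similarity in [0, 1]; the form and sigma are assumptions."""
    d = math.hypot(a.x - b.x, a.y - b.y)
    gauss = math.exp(-(d * d) / (2.0 * sigma * sigma))
    return gauss * max(0.0, math.cos(a.theta - b.theta))


def influence_weights(
    tokens: Sequence[SupportToken],
    similarity: Callable[[SupportToken, SupportToken], float] = example_similarity,
) -> List[float]:
    """Return W_i = S_i * W''_i for every token, in input order."""
    # Step I: order tokens by decreasing raw weight (ties keep input order).
    order = sorted(range(len(tokens)), key=lambda i: -tokens[i].raw_weight)
    weights = [0.0] * len(tokens)
    for rank, i in enumerate(order):
        # Step II: similarity to the most similar token of greater or equal weight.
        best = max(
            (similarity(tokens[i], tokens[j]) for j in order[:rank]),
            default=0.0,
        )
        # Step III (eq. 9), then scaling by edge strength (eq. 10).
        modified = tokens[i].raw_weight * (1.0 - best)
        weights[i] = tokens[i].strength * modified
    return weights


def weighted_location(
    tokens: Sequence[SupportToken], weights: Sequence[float]
) -> Tuple[float, float]:
    """Weighted average of the supporting tokens' (x, y) locations."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("no supporting token has positive influence")
    x = sum(w * t.x for w, t in zip(weights, tokens)) / total
    y = sum(w * t.y for w, t in zip(weights, tokens)) / total
    return x, y
```

With two coincident tokens at the origin and a third at x = 10, all of equal strength and raw weight, the second coincident token receives essentially zero weight under this sketch, so the averaged location falls near x = 5 rather than being pulled toward the redundant pair.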

4.1.3  Step  III.  Determine  Coarser  Scale  Token  Strength 

Under  the  Scale-Space  Blackboard  representation,  the  qualitative  presence 
or  absence  of  a  descriptive  token  such  as,  for  example,  an  edge  primitive, 
is  to  be  modulated  with  an  indication  of  how  strongly  the  token  asserts 
that  its  attribute  is  actually  present,  at  a  corresponding  pose,  in  the  shape 
object  under  observation.  This  is  the  token’s  strength  parameter.  Every 
seed  generated  in  step  I  leads  to  the  placement  of  a  coarser  scale  shape 
token  in  step  II.  However,  some  of  these  coarser  scale  tokens  represent  better 
primitive  edges  than  others.  Figure  24  presents  a  few  examples  of  situations 
in  which  the  assertion  of  a  coarser  scale  edge  is  more  strongly  or  more  weakly 
supported by the finer scale edges present. Step III assigns a strength, $S$,
$0 \le S \le 1$, to every newly created coarser scale primitive edge token.

Reasoning  from  the  examples  in  figure  24,  a  coarser  scale  edge  is  strongly 
supported  when  finer  scale  edges  are  aligned  all  along  its  length.  Strength 
decreases  when:  (1)  the  orientations  of  supporting  finer  scale  edges  deviate 
from  that  of  the  coarser  scale  edge,  and  when  (2)  supporting  tokens  fail  to 
span  its  entire  length.  A  mathematical  expression  reflecting  these  criteria  is: 

$$S \leftarrow \min\{1,\ [\min(V_{sum}, C) + \min(V_{front}, C) + \min(V_{rear}, C)]\}, \qquad (11)$$

where $C$ is a positive constant. $V_{sum}$ is a sum over all supporting tokens, $T_i$,
of each supporting token’s contribution to the strength of the new coarser
scale token:

$$V_{sum} = \sum_i V_i \qquad (12)$$

$$V_i = W_i^{\,p} \cos^{q} \Delta\theta_{c,i}, \qquad (13)$$

where $p$ and $q$ are positive constants, and $\Delta\theta_{c,i}$ is the difference between the
orientation of the coarse scale token and that of the supporting finer scale
token, $T_i$. The use of the influence-weight, $W_i$, ensures that redundant
supporting tokens do not unduly influence the strength computation.


Figure 24. A coarser scale token is assigned a strength according to whether
finer scale tokens are aligned with it all along its length. The situation in a.
receives greater strength than in b., c., or d.


The terms $V_{front}$ and $V_{rear}$ in equation (11) weigh support at the two ends
of the coarser scale edge, as follows:

$$V_{front} = \sum_{i \in front} V_i \, |\mathrm{snD}_{proj}| \qquad (14)$$

$$V_{rear} = \sum_{i \in rear} V_i \, |\mathrm{snD}_{proj}| \qquad (15)$$

$\mathrm{snD}_{proj}$ is the scale-normalized distance between supporting token $T_i$ and
the new coarse scale token, projected onto the length axis of the coarse scale
token (see figure 25). Equation (11) is constructed so that in order for a token
to receive a maximum strength of 1, it must receive substantial support along
its entire length.
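Equations (11) through (15) can be sketched in a few lines. The function name `coarse_token_strength` and the default values of `C`, `p`, and `q` are hypothetical. Two further assumptions are made: a supporting token counts toward the front or rear term according to the sign of its projected distance, and the cosine is clamped at zero so that non-integer `q` stays well defined. With these defaults, any `C` between 1/3 and 1/2 means that no two terms alone can reach a strength of 1.

```python
import math
from typing import Iterable, Tuple

# Each supporting token is summarized as (W_i, delta_theta_i, snD_proj_i):
# its influence-weight, its orientation difference from the coarse token
# (radians), and its signed scale-normalized distance along the coarse
# token's length axis (positive = front, negative = rear; an assumption).
Support = Tuple[float, float, float]


def coarse_token_strength(
    support: Iterable[Support],
    C: float = 0.4,   # hypothetical value of the constant C
    p: float = 1.0,   # hypothetical exponent on the influence-weight
    q: float = 2.0,   # hypothetical exponent on the cosine term
) -> float:
    """Strength S of a coarser scale token, following eq. (11)."""
    v_sum = v_front = v_rear = 0.0
    for weight, dtheta, proj in support:
        # Eq. (13): contribution of one supporting token.
        v = (weight ** p) * (max(0.0, math.cos(dtheta)) ** q)
        v_sum += v                      # eq. (12)
        if proj > 0.0:
            v_front += v * abs(proj)    # eq. (14)
        elif proj < 0.0:
            v_rear += v * abs(proj)     # eq. (15)
    return min(1.0, min(v_sum, C) + min(v_front, C) + min(v_rear, C))
```

For example, two well-aligned tokens on opposite sides of the coarse token's center saturate all three terms and yield a strength of 1, while the same support placed entirely on one side leaves the other end's term at zero.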


Figure  25.  Dproj  is  the  distance  from  a  token  to  a  reference  token,  projected 
onto  the  reference  token's  length  axis. 


4.1.4  Step  IV.  Subsample  the  Coarser  Scale  Description 

By the principle of self-similarity, coarser scale edge primitives describe larger
portions of a shape image than do edge primitives occurring at finer scales.
Also, they are proportionately less precise in specifying absolute spatial
location. Therefore, the coarse scale description of a shape employs tokens
more sparsely distributed across the shape image than does a fine scale
description. This is analogous to the case in signal processing, in which the
sampling required to reconstruct a signal depends upon its bandwidth.

The  procedure  for  generating  coarse  scale  tokens  creates  a  new  token  at 
every  seeded  location.  When  the  jump  in  scale  is  one  octave,  approximately 
twice  as  many  coarse  scale  tokens  are  generated  as  are  necessary.  While  this 
should  not  be  harmful  to  later  computations  for  any  fundamental  reasons,  it 
is  wasteful,  and  it  adversely  affects  the  perspicuity  of  the  coarse  scale  shape 
description.  For  this  reason  the  fourth  step  in  the  fine-to-coarse  aggregation 
procedure  is  to  prune  the  coarse  scale  shape  description  so  that  tokens  overlap 
one  another  by  approximately  50%  of  their  length. 

Figure 26. Tokens are pruned, weakest first, when they: a. lie very near in
pose to another token, or b. are sandwiched between other tokens.


The design of a procedure for subsampling the coarser scale description
follows three guidelines: (1) prune tokens of weaker strength first, (2) prune
a token lying very near another token in location and orientation, (3) prune
a token closely sandwiched between and aligned with two other tokens. See
figure 26. A satisfactory algorithm is the following:

I.  Sort  tokens  by  decreasing  strength,  S. 

II.  In  three  passes  through  the  sorted  list  of  all  tokens,  remove  tokens 
falling  under  criteria  2.  and  3. 

The three passes are taken with increasingly stringent bounds on how near
a given token is permitted to lie to another token. Taking several increasingly
severe passes has been found helpful in ensuring that weaker tokens, which
may yet describe important nuances in shape, are not prematurely
stomped out by stronger tokens.
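The multi-pass pruning can be sketched as below. This sketch covers only criterion 2 (a token lying very near another in location and orientation); the sandwiching test of criterion 3 is left out. The names `EdgeToken`, `angle_diff`, and `prune_tokens`, the three pass radii, and the orientation tolerance are all hypothetical, and distances are assumed to be expressed in the same scale-normalized units as the radii.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EdgeToken:
    x: float
    y: float
    theta: float     # orientation in radians
    strength: float  # token strength S


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two directions, in radians."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def prune_tokens(
    tokens: Sequence[EdgeToken],
    radii: Sequence[float] = (0.25, 0.5, 0.75),  # hypothetical pass bounds
    max_angle: float = math.radians(20.0),       # hypothetical tolerance
) -> List[EdgeToken]:
    """Return surviving tokens, strongest first, after successive passes."""
    # Step I: sort token indices by decreasing strength.
    order = sorted(range(len(tokens)), key=lambda i: tokens[i].strength, reverse=True)
    alive = [True] * len(tokens)
    # Step II: each pass applies a more severe nearness bound.
    for radius in radii:
        for i in reversed(order):  # examine the weakest tokens first
            if not alive[i]:
                continue
            for j in order:
                if j == i or not alive[j]:
                    continue
                near = math.hypot(tokens[i].x - tokens[j].x,
                                  tokens[i].y - tokens[j].y) <= radius
                aligned = angle_diff(tokens[i].theta, tokens[j].theta) <= max_angle
                if near and aligned:
                    alive[i] = False  # crowded by a surviving neighbor
                    break
    return [tokens[i] for i in order if alive[i]]
```

Because the mildest bound is applied first, a moderately strong token that sits at an intermediate distance from a stronger one survives the early passes and is removed only if it is still crowded under the final, most severe bound.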

4.2  Results 

Performance  of  the  fine  to  coarse  edge  primitive  aggregation  procedure  is 
illustrated in figures 27 through 30. As seen in figure 27, the coarse scale
description  of  the  apple  survives  well  even  when  the  contour  is  interrupted 
by  the  protrusion  of  a  string  (figure  27d),  and  when  other  large  objects  are  in 
proximity  (figure  27b).  In  figure  27c,  when  the  banana  moves  close  enough 
to  occlude  part  of  the  apple’s  contour,  much  of  the  apple’s  boundary  in  the 
vicinity  of  the  banana  is  nonetheless  detected  at  coarser  scales. 

Figure 28 helps to illustrate the fact that as scale increases, primitive
edge tokens demark figure/ground boundaries of decreasing spatial
resolution. This figure depicts grey-level images “reconstructed” from the tokens
residing in each of six slices of the Scale-Space Blackboard. For each token,
a lightened region (figure) and a darkened region (ground) were colored into
an 8-bit image on either side of the token. For convenience, the light/dark
colored region for each token takes the form of the oriented filter mask shown
in figure 8. As the pseudo-blurred images show, at coarser scales the
primitive edge information describes figure/ground boundaries of greater spatial
extent while smaller details of the object’s boundary are smoothed over.

In  order  to  illustrate  the  significance  of  a  token’s  strength  parameter, 
figure  29  displays  edge  tokens  at  three  scales  using  three  different  thresholds 
on  token  strength.  As  may  be  observed,  coarser  scale  edges  that  bridge  gaps 
and  cut  corners  are  assigned  lesser  strength  than  edges  falling  along  a  line  of 
smaller  scale  edges. 

Figure  30  shows  a  situation  in  which  the  aggregation  procedure  fails  to 
identify  coarse  scale  structure.  Note  that  the  smooth  pear  and  rippled  pear 
give rise to nearly identical coarse scale descriptions. However, when the
contour texture of the pear is extremely jagged, finer scale edge tokens lie nearly
perpendicular to the large scale figure/ground boundary, and are not
successfully grouped into coarse scale tokens falling along the boundary. Detection
of  this  sort  of  contour  may  be  addressed  by  the  development  of  additional 
grouping  rules,  or  else  by  some  form  of  numeric  smoothing  operation. 

We have shown that symbolic processes operating on collections of tokens
in a Scale-Space Blackboard are able in most cases to construct successively
coarser  shape  descriptions  in  terms  of  a  simple  vocabulary  in  which  tokens 
denote  edge  primitives.  The  Scale-Space  Blackboard  also  supports  other 
interesting  grouping  operations  making  explicit  more  complex  shape  entities. 

5  Pairwise  Grouping  of  Edge  Primitives 

Symbolic  tokens  denoting  edge  primitives  are  extremely  simple,  possessing 
only  the  attributes  of  pose  (location,  orientation,  and  scale)  and  strength. 
Let  us  refer  to  these  as  Type  0  tokens.  This  section  introduces  another  class 
of  shape  token,  called  Type  1  tokens,  possessing  one  additional  parameter  of 
internal  state.  Type  1  tokens  are  constructed  from  pairs  of  Type  0  tokens. 
The spatial configurations (Type 1 configurations) subsumed by this class
of tokens form a continuum which includes shapes that might be called
“curved contour segments,” “primitive-corners,” and “bars.” These terms
are  elaborated  below.  In  analogy  to  the  fine-to-coarse  aggregation  procedure, 
we  construct  pattern  matching  procedures  to  identify  Type  1  configurations 
occurring  in  the  Scale-Space  Blackboard,  and  then  mark  these  occurrences 
by  placing  Type  1  tokens  appropriately. 

5.1  Definition  of  Type  1  Configurations 

Two tokens in scale-space are spatially related to one another by four
numbers. These numbers must collectively specify the tokens’ relative x and y
location, relative orientation, and relative scale. Type 1 tokens possess one
internal parameter whose range generates a one-dimensional family of
configurations, in other words, a one-dimensional constraint-curve in the
four-dimensional space of a pair of Type 0 tokens’ relative configuration (see
[Saund, 1987]). The definition for Type 1 tokens must therefore constrain or
otherwise account for three remaining degrees of freedom.

Type 1 configurations are defined by specifying three constraints on the
relative poses of the two component Type 0 tokens: (1) The Type 0 tokens
must occur at the same scale, (2) The Type 0 tokens must be symmetrically
placed, (3) The Type 0 tokens must lie at a fixed, prespecified,
scale-normalized distance from one another.

The first condition, that two Type 0 tokens satisfying a Type 1
configuration must occur at the same scale, is straightforward.

The second requirement states that a Type 1 configuration must be
comprised of Type 0 tokens that are symmetrically placed. This condition is
illustrated in figure 31; the relative orientations between each token and
the line segment joining them must be equal. This specification of angular
equality lies behind the definition of the Smoothed Local Symmetries shape
representation [Brady and Asada, 1984; Connel, 1985; Fleck, 1985], and has
also been called “co-circularity” by Parent and Zucker [1985].

Strictly speaking, the first two conditions allow no tolerance for the tokens
to differ in scale or to deviate from symmetrical placement by even a slight
amount. Obviously, some tolerance is desirable. The question then arises:
how much tolerance is acceptable? We handle this question by
appealing to a token’s strength parameter. The closer to identical scale and
perfectly symmetrical alignment a pair of Type 0 tokens are placed, the closer
to 1 can be the strength of the Type 1 token naming the pair. As the Type
0 tokens stray, the Type 1 token strength must drop to 0.


Figure 31. Constraints on the spatial relationship of a pair of Type 0
tokens (edge primitives) if they are to satisfy the Type 1 configuration
conditions: a. symmetric placement (co-circularity), b. fixed, predetermined
scale-normalized distance. An additional condition is that the Type 0
tokens must occur at the same scale.

The third condition suggests that two Type 0 tokens satisfying the
conditions of a Type 1 configuration must lie at a characteristic predefined
sn-distance, $\mathrm{snD}_{target}$, from one another. See figure 31. Now, a pair of Type 0
tokens may certainly lie at virtually any (true) distance from one another,
depending upon the geometry of the shape object giving rise to it. By equation
(4), a given true distance ($D$) corresponds to another given scale-normalized
distance (for example, $\mathrm{snD}_{target}$) only at one particular scale. However, the
fine-to-coarse aggregation procedure places Type 0 tokens only at octave
intervals in the scale dimension. We cannot guarantee that Type 0 tokens will
have been placed precisely where needed along the scale dimension in order
to satisfy condition 3 of the definition of a Type 1 configuration.

The resolution to this matter is to note that a shape description does not
change rapidly across scales. In other words, the orientation and strength
attributes computed for a primitive edge token at one scale would be almost
identical to those of a primitive edge positioned at a closely nearby scale.
Therefore it is fair to adopt the following tactic: pretend that a Type 0 token
placed at a given scale generates a virtual set of Type 0 tokens possessing
the same (x, y) location and orientation, but placed at all surrounding scales
within, say, a one-half octave range. Then, Type 1 grouping takes place on
just the pair of virtual tokens required to satisfy condition 3. The resolution
amounts to this: place a Type 1 token in scale-space at a scale coordinate
depending upon the measured sn-distance between the two component Type
0 tokens. Specifically,

$$\sigma_{T1} = \sigma_{T0} + a \log \frac{\mathrm{snD}_{T0}}{\mathrm{snD}_{target}}, \qquad (16)$$

where $\sigma_{T1}$ is the placement of the Type 1 token along the scale dimension,
$\sigma_{T0}$ and $\mathrm{snD}_{T0}$ are respectively the scale of and scale-normalized distance
between the constituent Type 0 tokens, and $\mathrm{snD}_{target}$ is the characteristic
sn-distance defined for the Type 1 configuration.
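As a small illustration of equation (16), the placement of a Type 1 token along the scale dimension might be computed as follows. The function name `type1_scale` is hypothetical, the constant `a` and the base of the logarithm (natural log here) are assumptions since this section does not fix them, and the optional `max_shift` bound is an assumed stand-in for the "one-half octave range" of virtual tokens, expressed in whatever units the scale coordinate uses.

```python
import math
from typing import Optional


def type1_scale(
    sigma_t0: float,
    snd_t0: float,
    snd_target: float,
    a: float = 1.0,                      # assumed value of the constant a
    max_shift: Optional[float] = None,   # assumed bound on the scale shift
) -> Optional[float]:
    """Scale coordinate of a Type 1 token per eq. (16), or None if the
    required shift exceeds max_shift."""
    if snd_t0 <= 0.0 or snd_target <= 0.0:
        raise ValueError("scale-normalized distances must be positive")
    shift = a * math.log(snd_t0 / snd_target)
    if max_shift is not None and abs(shift) > max_shift:
        return None  # no virtual token pair lies close enough in scale
    return sigma_t0 + shift
```

When the measured sn-distance equals the target, the shift is zero and the Type 1 token sits at the scale of its constituents; a pair spaced more widely than the target (for positive `a`) is placed at a coarser scale.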

5.2  The  Class  of  Type  1  Configurations 

The internal parameter of a Type 1 token makes explicit one remaining degree
of freedom in the spatial configuration of two Type 0 tokens. This degree of
freedom is equivalent to the relative orientation of the Type 0 tokens. Figure
32 illustrates the range of configurations generated as this parameter varies.
Intuitive interpretations of several of these shapes come readily to mind.
When the Type 0 tokens’ orientations are roughly aligned, the parameter
makes explicit the local curvature of a curved-contour segment. When the
relative orientation is more or less 90°, the parameter describes the vertex
angle of a primitive-corner.⁵ Finally, when the Type 0 tokens are oriented
approximately 180° with respect to one another, the parameter describes the
taper of a bar. Bars, primitive-corners and, to a lesser extent, curved-contours
demark local partial-regions, as shown by the shaded areas in figure 32. Note
that the Type 1 parameter may take either positive or negative values.
Parameter values of opposite sign are related by reversal of the figure/ground
relationship.


Figure 32. Members of the class of Type 1 configurations. Each member
defines the open boundary of a partial-region.

Computation of Type 1 tokens from Type 0 tokens is quite straightforward.
Pairs of Type 0 tokens satisfying the three criteria are easily found
by virtue of the spatial indexing and scale indexing afforded by the
Scale-Space Blackboard data structure. Wherever a Type 1 configuration is found,
a Type 1 token is placed at some suitable pose on the Blackboard, such as
midway between the constituent Type 0 tokens.

5.3  Results 

Figures  33  through  35  present  the  results  of  Type  1  token  grouping  for  several 
shape  objects.  Each  Type  1  token  is  displayed  as  a  line  segment  placed  at 
the  token’s  pose  in  the  image,  with  a  small  circle  at  one  end  indicating  its 
orientation.  In  addition,  the  two  Type  0  tokens  supporting  this  Type  1  token 
are  also  drawn.  For  clarity,  those  Type  1  tokens  are  omitted  which  describe  a 
gently  curved  section  of  contour;  only  primitive-corners  and  bars  are  shown. 

Figure  33  shows  partial-regions  found  for  a  Trout-Perch  shape.  Note  that 
Type  1  tokens  make  explicit  salient  negative  or  background  partial  regions, 
such  as  the  fork  of  the  tail,  as  well  as  regions  forming  parts  of  the  figure 
itself.  These  are  distinguished  by  the  sign  of  the  Type  1  parameter  within 
each  Type  1  token  (although  this  number  is  not  displayed).  Figures  34  and 
35  show  that  large  scale  partial-region  description  of  the  body  of  an  apple  is 
not  fazed  by  a  radical  alteration  in  the  bounding  contour  formed  when  the 
apple  is  hung  from  a  string,  nor  by  the  presence  of  a  nearby  object  such  as 
a  banana. 

Figures 33 through 35 also show that the Type 0 and Type 1 grouping
rules interpret the scale of regions and the scale of contours in a different
manner. Type 0 fine-to-coarse aggregation places figure/ground boundaries
at a coarse scale if they are of large linear (one-dimensional) extent. Thus,
the string tied to the apple generates coarse scale Type 0 tokens. In
contrast, Type 1 partial-region grouping places shape features at a coarse scale
according to their two-dimensional spatial extent, or area. Therefore the
string, which is of locally small area because of its narrow width, appears
only at fine scales in the Type 1 representation.

⁵ The term “primitive-corner” is used to emphasize that the Type 1 shape
description occurs independently at different scales. The term “corner” is
reserved for future descriptors of corner shapes integrating information
across several scales.

It  is  worth  noting  that  one  aspect  of  shape  structure  not  sought  by  the 
Type  1  grouping  rules  is  nonlocal  symmetry.  This  is  to  say,  structure  is  found 
only  at  distances  commensurate  with  the  scale  of  the  tokens  being  grouped. 
In  particular,  at  this  early  stage  no  attempt  is  made  to  identify  configurations 
such  as  shown  in  figure  36,  where  fine  scale  tokens  form  a  symmetrical  pair 
but are spaced remotely with respect to their scale. This restriction bounds
the complexity of the Type 1 grouping operation because it limits the
neighborhood within which to search for other Type 0 tokens forming a Type 1
configuration with any given Type 0 token. The spatial and scale indexing
provided by the Scale-Space Blackboard provides the substrate mechanism
supporting this spatially limited search. Because the neighborhood of a Type
0 token is defined in terms of scale-normalized distance, that is, its
absolute size depends upon the scale of the Type 0 token itself, symmetrical
configurations spanning large distances are identified by the Type 1
grouping rules, but only when their component Type 0 tokens are themselves of
a large scale. This scale-relative quality of the computation arises naturally
from the property of self-similarity across scales supported by the scale-space
representation.


Figure 36. Type 1 grouping does not attempt to group pairs of edge
primitives located remotely with respect to their scale.

6  Conclusion 

This paper has presented an alternative to numerical smoothing or
blurring approaches to building multiscale shape descriptions. By performing
grouping operations on symbolic shape tokens, coarse scale structure is made
explicit based on information present at finer scales of description. Unlike
numerical blurring, however, the symbolic grouping rules afford substantial
control over just what kinds of coarser scale structure are and are not identified.
As a result, the multiscale description of an object’s shape retains stability
under the presence of other nearby objects, such as when an apple is placed
near a banana, and under disruptions of perceptually salient contours, such
as when an apple is hung from a string. We acknowledge the importance
of treating regions and contours as complementary aspects of shape
geometry, and therefore have designed distinct operations for extracting multiscale
contour and region information.

In the course of developing the symbolic grouping approach to multiscale
shape representation, we have introduced the Scale-Space Blackboard as a
tool for maintaining and accessing spatial information. Shapes are
represented in terms of symbolic tokens placed on the Blackboard. This strategy
serves as a step toward bridging the gulf between the iconic or image-like
representation of a shape implicit in an array of pixels, and later stages of
representation making use of purely symbolic data structures. The tokens
placed on the Scale-Space Blackboard are symbolic in that they may contain
not just a grey-level value, but frame slots, numbers, lists, and pointers, yet
the representation is image-like in that the Scale-Space Blackboard provides
for indexing of tokens based on location and scale. The use of symbolic
tokens, spatially arranged, was first suggested by Marr [1976] in his discussion
of the Primal Sketch. Although Marr recognized the significance of scale, the
possibility of interpreting scale as a distinct dimension in addition to the
spatial dimensions was not elaborated until some years later by Witkin [1983].
This work unites these two ideas. A similar approach to finding extended
straight lines in grey-level images is adopted by [Weiss and Boldt, 1986] and
[Boldt and Weiss, 1987].


Figure 37. “Spine” axes computed from the Type 1 tokens in figure 33 by a
very simple clustering algorithm.

The stage is now set to construct additional procedures operating over
the contents of the Scale-Space Blackboard in order to identify more complex
and more abstract geometric events and shape properties. These procedures
may write new tokens onto the Blackboard, with token types corresponding
to the properties they identify. For example, one commonly sought shape
description is a listing of an object’s “spines,” or part axes. Figure 37 shows
axes found by performing a very simple clustering operation on the Type
1 tokens of figure 33. These spines are only an illustration that the
multiscale shape description delivered does indeed support the extraction of more
complex shape entities; the proper design of a “spine” token making explicit
taper, spine curvature, and so forth is a subject for further work.

Because the Scale-Space Blackboard retains a pictorial quality while the
symbolic tokens it contains may represent extended spatial events, or “chunks
of shape,” it is not unlikely that this approach to shape representation may
also serve as a suitable substrate for elemental visual operations supporting
Visual Routines [Ullman, 1983; Mahoney, 1987].